Variable-Sized Map and Locality-Aware Reduce on Public-Resource Grids
Abstract
This paper presents a grid-enabled MapReduce framework called “Ussop”. Ussop provides its users with a set of C-language MapReduce APIs and an efficient runtime system for exploiting the computing resources available on public-resource grids. To cope with the volatile nature of the grid environment, Ussop introduces two novel task scheduling algorithms: Variable-Sized Map Scheduling (VSMS) and Locality-Aware Reduce Scheduling (LARS). VSMS dynamically adjusts the size of the map tasks according to the computing power of the grid nodes, while LARS minimizes the cost of transferring intermediate data over a wide-area network. The experimental results indicate that both VSMS and LARS outperform conventional scheduling algorithms.
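The abstract does not spell out how VSMS and LARS work internally, so the following Python sketch only illustrates the two ideas under stated assumptions: map work is split in proportion to a per-node speed score, and a reduce task is placed on the node that already holds the largest share of its intermediate data. The function names, the speed scores, and the byte counts are all hypothetical and are not part of the Ussop API.

```python
from typing import Dict, Tuple


def variable_sized_map_chunks(total_records: int,
                              node_speeds: Dict[str, float]) -> Dict[str, int]:
    """Split total_records across nodes in proportion to an assumed speed score.

    This is an illustrative proportional split, not the published VSMS algorithm.
    """
    total_speed = sum(node_speeds.values())
    if total_speed <= 0:
        raise ValueError("at least one node must have a positive speed score")

    # Round each share down, then hand the leftover records to the fastest nodes.
    sizes = {node: int(total_records * speed / total_speed)
             for node, speed in node_speeds.items()}
    leftover = total_records - sum(sizes.values())
    fastest_first = sorted(node_speeds, key=node_speeds.get, reverse=True)
    for node in fastest_first[:leftover]:
        sizes[node] += 1
    return sizes


def locality_aware_reduce_site(partition_bytes: Dict[str, int]) -> Tuple[str, int]:
    """Pick the node that minimizes the bytes fetched from other nodes.

    partition_bytes maps each node to the (assumed) size of the intermediate
    data it holds for one reduce partition. Returns (node, bytes_to_transfer).
    This is an illustrative greedy choice, not the published LARS algorithm.
    """
    total = sum(partition_bytes.values())
    best = min(partition_bytes, key=lambda node: total - partition_bytes[node])
    return best, total - partition_bytes[best]


if __name__ == "__main__":
    print(variable_sized_map_chunks(10, {"fast": 3.0, "slow": 1.0}))
    print(locality_aware_reduce_site({"a": 10, "b": 70, "c": 20}))
```

In this sketch a node three times faster than another receives roughly three times as many records, and the reduce task runs where the least data has to cross the wide-area network. A real scheduler would also have to re-estimate node speed at run time and handle nodes that leave the grid.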
Similar resources
Predoop: Preempting Reduce Task for Job Execution Accelerations
Map/Reduce is a popular parallel processing framework for data intensive computing. For overlapping the Map task’s execution phase and the Reduce task’s intermediate data fetching and merging phase, existing Map/Reduce schedulers always pre-launch the Reduce task at the specific threshold where its map tasks have been launched, and this pattern incurs the occupation of the consuming resources o...
Scalable community-driven data sharing in e-science grids
E-science projects of various disciplines face a fundamental challenge: thousands of users want to obtain new scientific results by application-specific and dynamic correlation of data from globally distributed sources. Considering the enormous and exponentially growing data volumes involved, centralized data management reaches its limits. Since scientific data are often highly skewed and explor...
Effect of green human resource management practices on environmental sustainability
In today’s world, green human resource management is one of the most important factors for a forward-thinking, environmentally friendly business. Most researchers hold that employees must be empowered and environmentally aware while carrying out green human resource management practices. The present study examines the impact of different green human resource prac...
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
The Hadoop MapReduce framework is an important distributed processing model for large-scale, data-intensive applications. Current Hadoop and the rack-aware data placement strategy of the Hadoop distributed file system assume a homogeneous cluster in which each node has the same computing capacity and is assigned the same workload. Default Hadoop d...
Boosting MapReduce with Network-Aware Task Assignment
Running MapReduce in a shared cluster has become a recent trend to process large-scale data analytics applications while improving the cluster utilization. However, the network sharing among various applications can lead to constrained and heterogeneous network bandwidth available for MapReduce applications. This further increases the severity of network hotspots in racks, and makes existing ta...